A p value from the Hosmer-Lemeshow (H-L) test: In Figure 18-4a, this is listed under Hosmer-Lemeshow Goodness of Fit Test. The null hypothesis for this test is that your data are consistent with the logistic function’s S shape, so if p < 0.05, your data do not qualify for logistic regression. The focus of the test is to see whether the S is getting distorted at very high or very low levels of the predictor (as shown in Figure 18-4b). In Figure 18-4a, the H-L p value is 0.842, which means that the data are consistent with the shape of a logistic curve.
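
If your software doesn’t report the H-L test, you can compute it yourself from the observed outcomes and the model’s predicted probabilities. The following is a minimal sketch in Python (using NumPy and SciPy); the array names y and p_hat are illustrative, and the ten-group "deciles of risk" split with g - 2 degrees of freedom follows the usual convention.

```python
import numpy as np
from scipy import stats

def hosmer_lemeshow(y, p_hat, groups=10):
    """Hosmer-Lemeshow goodness-of-fit test for a fitted logistic model.

    y      : observed 0/1 outcomes
    p_hat  : predicted probabilities from the fitted model
    groups : number of groups (deciles of risk by default)
    """
    y = np.asarray(y, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)

    # Sort cases by predicted probability and split into roughly equal groups
    order = np.argsort(p_hat)
    y, p_hat = y[order], p_hat[order]
    group_indices = np.array_split(np.arange(len(y)), groups)

    h_stat = 0.0
    for idx in group_indices:
        n_g = len(idx)                     # number of cases in this group
        observed = y[idx].sum()            # observed events
        expected = p_hat[idx].sum()        # expected events under the model
        denom = expected * (1 - expected / n_g)
        if denom > 0:                      # guard against division by zero
            h_stat += (observed - expected) ** 2 / denom

    df = groups - 2                        # conventional degrees of freedom
    return h_stat, stats.chi2.sf(h_stat, df)
```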

One or more pseudo–r2 values: Pseudo–r2 values indicate how much of the total variability in the outcome is explainable by the fitted model. They are analogous to how r2 is interpreted in ordinary least-squares regression, as described in Chapter 17. In Figure 18-4a, two such values are provided under the labels Cox/Snell R-square and Nagelkerke R-square. The Cox/Snell r2 is 0.577, and the Nagelkerke r2 is 0.770, both of which indicate that a majority of the variability in the outcome is explainable by the logistic model.
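
Both pseudo–r2 values can be calculated from just two numbers your software already computes: the log-likelihood of the fitted model and the log-likelihood of a null (intercept-only) model. Here is a minimal sketch in Python; the names ll_model, ll_null, and n (the sample size) are illustrative.

```python
import numpy as np

def cox_snell_r2(ll_model, ll_null, n):
    # Cox/Snell: 1 - (L_null / L_model)^(2/n), computed on the log scale
    return 1.0 - np.exp(2.0 * (ll_null - ll_model) / n)

def nagelkerke_r2(ll_model, ll_null, n):
    # Nagelkerke rescales Cox/Snell so that its maximum possible value is 1
    max_cs = 1.0 - np.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_model, ll_null, n) / max_cs
```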

Akaike’s Information Criterion (AIC): AIC is a measure of the final model deviance adjusted for how many predictor variables are in the model. Like deviance, the smaller the AIC, the better the fit. The AIC is not very useful on its own; instead, it is used for choosing between different models. When all the predictors in one model are nested (that is, included) in another model with more predictors, the AIC is helpful for comparing these models to see whether it is worth adding the extra predictors.
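
AIC is calculated as 2k - 2 × (log-likelihood), where k is the number of estimated parameters. The sketch below uses simulated data and the statsmodels library (both are illustrative choices, not the analysis shown in Figure 18-4a) to fit a smaller model that is nested inside a larger one and compare their AIC values.

```python
import numpy as np
import statsmodels.api as sm

# Simulated data: one predictor (x1) that affects the outcome, one (x2) that doesn't
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
true_prob = 1 / (1 + np.exp(-(0.5 + 1.2 * x1)))
y = rng.binomial(1, true_prob)

# The smaller model (x1 only) is nested in the larger model (x1 and x2)
X_small = sm.add_constant(np.column_stack([x1]))
X_big = sm.add_constant(np.column_stack([x1, x2]))

fit_small = sm.Logit(y, X_small).fit(disp=0)
fit_big = sm.Logit(y, X_big).fit(disp=0)

# AIC = 2 * (number of parameters) - 2 * log-likelihood;
# the model with the lower AIC is preferred
print("AIC, x1 only:   ", round(fit_small.aic, 1))
print("AIC, x1 and x2: ", round(fit_big.aic, 1))
```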

Checking out the table of regression coefficients

Your intention when developing a logistic regression model is to obtain estimates from the table of coefficients, which looks much like the coefficients table from ordinary straight-line or multivariate least-squares regression (see Chapters 16 and 17). In Figure 18-4a, the coefficients are listed under Coefficients and Standard Errors. Observe:

Every predictor variable appears on a separate row.

There’s one row for the constant term labeled Intercept.

The first column usually lists the regression coefficients (under Coeff. in Figure 18-4a).

The second column usually lists the standard error (SE) of each coefficient (under StdErr in Figure 18-4a).

A p-value column indicates whether the coefficient is statistically significantly different from 0. This column may be labeled Sig or Signif or Pr(>|z|), but in Figure 18-4a, it is labeled p-value.

For each predictor variable, the output should also provide the odds ratio (OR) and its 95 percent confidence interval. These are usually presented in a separate table, as they are in Figure 18-4a under Odds Ratios and 95% Confidence Intervals.
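
If your output doesn’t include such a table, the odds ratios are easy to obtain yourself: each OR is simply e raised to the power of the coefficient, and its 95 percent confidence interval is the exponentiated confidence interval of that coefficient. Here is a minimal sketch using simulated data and the statsmodels library (both are illustrative assumptions, not the analysis in Figure 18-4a).

```python
import numpy as np
import statsmodels.api as sm

# Small simulated data set for illustration only
rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.3 + 0.8 * x))))

result = sm.Logit(y, sm.add_constant(x)).fit(disp=0)

coef = result.params            # regression coefficients (intercept first)
se = result.bse                 # standard errors
p_values = result.pvalues       # p values testing each coefficient against 0

# Odds ratio = e^coefficient; its 95% CI is the exponentiated CI of the coefficient
odds_ratios = np.exp(coef)
ci_low, ci_high = np.exp(result.conf_int(alpha=0.05)).T

for name, or_, lo, hi, p in zip(["Intercept", "x"], odds_ratios, ci_low, ci_high, p_values):
    print(f"{name}: OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}, p = {p:.4f}")
```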

Predicting probabilities with the fitted logistic formula